feat(database): Database Filtering via custom configuration #24580

Antonio-RiveroMartnez · 2023-07-03T15:00:11Z

SUMMARY

There are scenarios where you want to apply custom filtering to the list of databases returned by the DatabaseRestApi. Right now, there is no way to do so other than monkey patching. This PR adds a new config definition to our app and uses it in the DatabaseFilter, which is applied to all searches. You can now add custom filtering via config when needed, which gives our searches live filtering capabilities and eases customization.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

Add a filtering function to your config file so that all GET and GET list requests use it and return a filtered result.

ADDITIONAL INFORMATION

Has associated issue:
Required feature flags:
Changes UI
Includes DB Migration (follow approval process in SIP-59)
- Migration is atomic, supports rollback & is backwards-compatible
- Confirm DB migration upgrade and downgrade tested
- Runtime estimates and downtime expectations provided
Introduces new feature or API
Removes existing feature or API

codecov · 2023-07-03T15:07:06Z

Codecov Report

Merging #24580 (a47bdc2) into master (226c7f8) will decrease coverage by 0.02%.
The diff coverage is 86.06%.

❗ Current head a47bdc2 differs from pull request most recent head dac34ff. Consider uploading reports for the commit dac34ff to get more accurate results

@@            Coverage Diff             @@
##           master   #24580      +/-   ##
==========================================
- Coverage   69.08%   69.06%   -0.02%     
==========================================
  Files        1906     1906              
  Lines       74168    74114      -54     
  Branches     8164     8165       +1     
==========================================
- Hits        51239    51187      -52     
+ Misses      20807    20804       -3     
- Partials     2122     2123       +1

| Flag | Coverage Δ | |
|---|---|---|
| hive | `54.14% <39.87%> (+0.20%)` | ⬆️ |
| mysql | `79.48% <83.54%> (+0.08%)` | ⬆️ |
| postgres | `?` | |
| presto | `54.04% <39.87%> (+0.20%)` | ⬆️ |
| python | `83.46% <86.07%> (-0.03%)` | ⬇️ |
| sqlite | `78.13% <70.25%> (+0.08%)` | ⬆️ |
| unit | `54.81% <48.73%> (+0.12%)` | ⬆️ |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ | |
|---|---|---|
| superset/daos/annotation.py | `100.00% <ø> (+12.76%)` | ⬆️ |
| superset/examples/utils.py | `0.00% <0.00%> (ø)` | |
| superset/views/base.py | `73.33% <25.00%> (ø)` | |
| superset/databases/commands/update.py | `74.44% <50.00%> (+0.28%)` | ⬆️ |
| ...erset-frontend/src/SqlLab/components/App/index.jsx | `82.75% <83.33%> (-0.58%)` | ⬇️ |
| superset/extensions/metastore_cache.py | `96.61% <90.00%> (-1.51%)` | ⬇️ |
| superset-frontend/src/logger/LogUtils.ts | `97.29% <100.00%> (+0.07%)` | ⬆️ |
| ...otation_layers/annotations/commands/bulk_delete.py | `87.50% <100.00%> (ø)` | |
| ...t/annotation_layers/annotations/commands/delete.py | `83.33% <100.00%> (-1.29%)` | ⬇️ |
| superset/annotation_layers/commands/bulk_delete.py | `84.61% <100.00%> (ø)` | |

... and 30 more

... and 12 files with indirect coverage changes


- Add new configuration so we can inject extra filters to our databases when running the DatabaseFilter in base_filters
- Add tests for our new config and its usage

hughhhh · 2023-07-05T16:05:23Z

tests/integration_tests/databases/api_tests.py

+            assert rv.status_code == 200
+
+        # Cleanup
+        first_model = db.session.query(Database).get(first_response.get("id"))


We need to figure out how to do this cleanup via a fixture or after every test, so we always land back in a normal state.

superset/databases/filters.py

Antonio-RiveroMartnez · 2023-07-05T16:17:27Z

tests/integration_tests/databases/api_tests.py

+        uri = f"api/v1/database/"
+        rv = self.client.get(uri)
+        data = json.loads(rv.data.decode("utf-8"))
+        self.assertEqual(data["count"], len(dbs))


@hughhhh here I am testing our current behavior (default) where all databases must be returned if nothing is being set in the config, so dynamic_filter is not defined. Then, I'm adding the patch for the config to add the filter function and testing it's being applied because dynamic_filter is defined.

Added explicit assertions to check whether the filter method has been called when defined.
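The default-versus-patched pattern described in this comment can be sketched roughly as follows. This is not the test from the PR: `CONFIG`, `ALL_DATABASES`, and `list_databases` are hypothetical stand-ins for the app config and the databases API, and the filter works on a plain list rather than a SQLAlchemy query. Only `unittest.mock` is real.

```python
from unittest import mock

# Hypothetical stand-ins for the app config and the stored databases.
CONFIG = {"EXTRA_DYNAMIC_QUERY_FILTERS": {}}
ALL_DATABASES = ["examples", "analytics", "internal_audit"]


def list_databases():
    """Return database names, applying the optional dynamic filter first."""
    names = list(ALL_DATABASES)
    dynamic_filter = CONFIG["EXTRA_DYNAMIC_QUERY_FILTERS"].get("databases")
    if dynamic_filter:
        names = dynamic_filter(names)
    return names


def test_default_returns_everything():
    # Nothing is set in the config, so every database comes back.
    assert list_databases() == ALL_DATABASES


def test_configured_filter_is_applied():
    # A mock filter lets us assert both that it was called and what it did.
    keep_first = mock.Mock(side_effect=lambda names: names[:1])
    with mock.patch.dict(
        CONFIG, {"EXTRA_DYNAMIC_QUERY_FILTERS": {"databases": keep_first}}
    ):
        assert list_databases() == ["examples"]
    keep_first.assert_called_once()
    # patch.dict restores the config on exit, so the default behavior is back.
    assert list_databases() == ALL_DATABASES
```

Because `mock.patch.dict` undoes its changes when the block exits, the config is back in its original state without an explicit cleanup step.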

john-bodley

Could you provide some scenarios where you would use this? I also wonder if this should be generalized for all entity types.

Additionally, I’m not saying I’m a supporter of monkey patching, but we should be cognizant that sometimes the method is used to relax (as opposed to further restrict) filters, whereas this approach only addresses the latter.

john-bodley · 2023-07-05T17:19:30Z

superset/databases/filters.py

@@ -41,6 +41,16 @@ class DatabaseFilter(BaseFilter):  # pylint: disable=too-few-public-methods
    # TODO(bogdan): consider caching.

    def apply(self, query: Query, value: Any) -> Query:
+        # Dynamic Filters need to be applied to the Query before we filter


Feels like a good place for a docstring comment.

- Add explicit assertions to validate the filter is not called when not defined
- Use docstring comment

Antonio-RiveroMartnez · 2023-07-05T17:56:26Z

Could you provide some scenarios where you would use this? I also wonder if this should be generalized for all entity types.

Additionally, I’m not saying I’m a supporter of monkey patching, but we should be cognizant that sometimes the method is used to relax (as opposed to further restrict) filters, whereas this approach only addresses the latter.

There might be cases where I have a given database that I want to make visible/hidden to my users based on a Feature Flag (live change). Right now, our options are to override the entire DatabaseRestApi or to monkey patch it. This change extends those options: we can define a custom filter method in the config and have it applied to our responses in an easy way.
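As a rough illustration of the feature-flag scenario, the sketch below reads a flag every time the filter runs, so toggling the flag changes the result without a restart. Everything here is assumed for illustration: `FLAGS`, `feature_flag_filter`, and the database names are hypothetical, and the filter takes a plain list of names rather than a real query.

```python
# Hypothetical flag store; a real deployment might query a feature-flag service.
FLAGS = {"SHOW_BETA_DATABASE": False}


def feature_flag_filter(names):
    """Hide 'beta_warehouse' unless the flag is on (the flag is read on every call)."""
    if FLAGS["SHOW_BETA_DATABASE"]:
        return list(names)
    return [name for name in names if name != "beta_warehouse"]


databases = ["examples", "beta_warehouse"]

# Flag off: the beta database is filtered out.
hidden = feature_flag_filter(databases)

# Flag flipped at runtime: the same filter now lets it through.
FLAGS["SHOW_BETA_DATABASE"] = True
visible = feature_flag_filter(databases)
```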

As for generalizing this to all entities, I will follow the same concept we use for EXTRA_RELATED_QUERY_FILTERS and define a databases key, which is the one the DatabaseFilter pulls from. That way we can extend the config later for other entities if needed, e.g. by adding dashboards or charts keys. As part of this, I'm also renaming the config to EXTRA_DYNAMIC_QUERY_FILTERS.
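A minimal sketch of the keyed config shape described here, under stated assumptions: `FakeQuery`, `hide_internal_databases`, and `apply_dynamic_filter` are hypothetical names used only for illustration, and `FakeQuery` stands in for a SQLAlchemy query so the example runs on its own. Only the config name `EXTRA_DYNAMIC_QUERY_FILTERS` and the `databases` key come from the discussion.

```python
from typing import Callable, Dict, Iterable, List


class FakeQuery:
    """Hypothetical stand-in for a SQLAlchemy Query over database names."""

    def __init__(self, names: Iterable[str]) -> None:
        self.names: List[str] = list(names)

    def filter(self, predicate: Callable[[str], bool]) -> "FakeQuery":
        return FakeQuery(name for name in self.names if predicate(name))

    def all(self) -> List[str]:
        return list(self.names)


def hide_internal_databases(query: FakeQuery) -> FakeQuery:
    # Example filter: drop anything whose name starts with "internal_".
    return query.filter(lambda name: not name.startswith("internal_"))


# One key per entity type; only "databases" is discussed in this PR.
EXTRA_DYNAMIC_QUERY_FILTERS: Dict[str, Callable[[FakeQuery], FakeQuery]] = {
    "databases": hide_internal_databases,
}


def apply_dynamic_filter(query: FakeQuery, entity: str) -> FakeQuery:
    """Apply the configured filter for `entity`, or return the query unchanged."""
    dynamic_filter = EXTRA_DYNAMIC_QUERY_FILTERS.get(entity)
    return dynamic_filter(query) if dynamic_filter else query


query = FakeQuery(["examples", "internal_audit", "analytics"])
filtered_databases = apply_dynamic_filter(query, "databases").all()
# No "dashboards" key is configured, so that query passes through untouched.
unfiltered_dashboards = apply_dynamic_filter(query, "dashboards").all()
```

Keying the config by entity means a later `dashboards` or `charts` filter would be a new dictionary entry rather than a new config variable.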

- Generalize the new config so we can use it for other entities down the road.
- Pull the new database key from the config so we apply any given filter method in the databases API
- Adjust our tests with the new config name and structure

betodealmeida

Elegant!

betodealmeida · 2023-07-06T19:57:38Z

tests/integration_tests/databases/api_tests.py

+        uri = f"api/v1/database/"
+        rv = self.client.get(uri)
+        data = json.loads(rv.data.decode("utf-8"))
+        self.assertEqual(data["count"], len(dbs))


here I am testing our current behavior (default) where all databases must be returned if nothing is being set in the config, so dynamic_filter is not defined. Then, I'm adding the patch for the config to add the filter function and testing it's being applied because dynamic_filter is defined.

Can you write that down in the function docstring? :)

- Add more info to our comments in our tests

john-bodley · 2023-07-06T21:14:49Z

@Antonio-RiveroMartnez regarding your comment,

There might be cases where I have a given database that I want to make visible/hidden to my users based on a Feature Flag (live change).

If that were the case, shouldn't the permission override logic be handled in the security manager? Note that I'm not blocking this change; I just think there's merit in making sure we measure twice and cut once when adding new functionality.

Antonio-RiveroMartnez requested a review from dpgaspar July 3, 2023 15:00

pull-request-size bot added the size/L label Jul 3, 2023

Antonio-RiveroMartnez force-pushed the extra_database_filter branch from fdea47a to ece4c77 Compare July 3, 2023 15:38

Database Filtering:

579c359

- Add new configuration so we can inject extra filters to our databases when running the DatabaseFilter in base_filters
- Add tests for our new config and its usage

Antonio-RiveroMartnez force-pushed the extra_database_filter branch from ece4c77 to 579c359 Compare July 3, 2023 16:02

hughhhh reviewed Jul 5, 2023

superset/databases/filters.py (outdated)

Antonio-RiveroMartnez commented Jul 5, 2023

michael-s-molina requested a review from john-bodley July 5, 2023 17:04

john-bodley reviewed Jul 5, 2023

Database filtering:

eaddeae

- Add explicit assertions to validate the filter is not called when not defined
- Use docstring comment

Database filtering:

a8995d8

- Generalize the new config so we can use it for other entities down the road.
- Pull the new database key from the config so we apply any given filter method in the databases API
- Adjust our tests with the new config name and structure

betodealmeida approved these changes Jul 6, 2023

Database Filtering:

dac34ff

- Add more info to our comments in our tests

Antonio-RiveroMartnez merged commit 6657353 into apache:master Jul 6, 2023
29 checks passed

mistercrunch added the 🏷️ bot and 🚢 3.1.0 labels Mar 8, 2024
